Real-world Cardiovascular Outcomes Associated With Degarelix vs Leuprolide for Prostate Cancer Treatment

Key Points

Question: Can real-world data regarding the use of degarelix and leuprolide be used to emulate the forthcoming PRONOUNCE trial, a phase 3b trial comparing the cardiovascular safety of degarelix vs leuprolide among patients with prostate cancer and cardiovascular disease?

Findings: In this cohort study of 2226 propensity score–matched men with prostate cancer taking degarelix or leuprolide, no significant difference was observed in the risk of a major adverse cardiovascular event.

Meaning: These findings suggest that real-world data are increasingly available and useful for medical product evaluation, including for emulating clinical trials to understand products’ use in clinical practice and the associated benefits and harms of treatment.

PRONOUNCE criterion: Histologically confirmed adenocarcinoma of the prostate. Tumor, node, metastasis staging available prior to treatment start (bone scan and/or CT scan and/or MRI) <12 weeks prior to study start. If no radiographic image is available at the time of screening, a bone scan should be performed.
Operational definition: NA. Already defined in the initial cohort: patients must have at least one "Evaluation and Management" visit with a diagnosis of prostate cancer within 6 months before the index date and at least one "Evaluation and Management" visit with a diagnosis of prostate cancer at any time after the index date. Sensitivity analysis: prostate biopsy required.

PRONOUNCE criterion: Indication to initiate androgen deprivation therapy (ADT).
Operational definition: NA. The index date is the first fill of degarelix or leuprolide.*
*For the leuprolide cohort, we will allow up to one month of bicalutamide prior to leuprolide initiation.

PRONOUNCE criterion: Predefined cardiovascular disease inclusion criteria. Pre-existing ASCVD (confirmed diagnosis, documented) according to at least one of the following criteria: prior myocardial infarction >=30 days before randomization; prior revascularization procedure >=30 days before randomization (coronary artery: stent placement/balloon angioplasty or coronary artery bypass graft surgery; carotid artery: stent placement/balloon angioplasty or endarterectomy surgery; iliac, femoral, popliteal arteries: stent placement/balloon angioplasty or vascular bypass surgery); at least one vascular stenosis >=50% at any time point before randomization.
Operational definition: To establish history of cardiovascular disease using claims data, it is common to use both primary and secondary discharge diagnosis and procedure codes. In particular, we will identify the following using discharge diagnosis and procedure codes as indicators of a history of cardiovascular disease, >=30 days before the index date: myocardial infarction, percutaneous coronary intervention (PCI), coronary artery bypass grafting (CABG), peripheral artery revascularization, carotid revascularization any position.

Recent or active cardiovascular events.
Operational definition: To identify recent/active cardiovascular events, we will use primary diagnosis of myocardial infarction and stroke (emergency department or inpatient visits) within 30 days before index date, as patients hospitalized for an acute cardiovascular event would be expected to have these diagnoses listed as the primary discharge diagnosis, while patients with a history of cardiovascular disease would be expected to have these diagnoses listed as secondary discharge diagnoses. Percutaneous coronary intervention (PCI), coronary artery bypass grafting (CABG), peripheral artery revascularization (limb events), carotid revascularization within 30 days before index date.

PRONOUNCE criterion: Planned or scheduled cardiac surgery or PCI procedure that is known at the time of randomization.
Operational definition: NA. We cannot determine planned or scheduled PCI procedures.

PRONOUNCE criterion: Ankle-brachial pressure index <0.9 at any point before randomization.
Operational definition: NA. Restricting to patients with laboratory data would limit the sample.

ASCVD = atherosclerotic cardiovascular disease; CT = computed tomography; DBP = diastolic blood pressure; HbA1c = hemoglobin A1c; NA = not applicable; SBP = systolic blood pressure.

Study inclusion and exclusion criteria
We will apply the PRONOUNCE trial inclusion and exclusion criteria listed on ClinicalTrials.gov, which we will update with information made available by the PRONOUNCE trial authors, [14] to patients represented in OptumLabs data (Table 1). We will not restrict the real-world sample to the planned sample size for the trial, but rather include all patients who otherwise meet the eligibility criteria. However, we will apply the inclusion criteria and then the exclusion criteria successively, determining which criteria have the biggest impact on the size of the population of patients observed in real-world data (Table 2).
The PRONOUNCE trial includes male patients, without any age restrictions, with advanced prostate cancer and cardiovascular disease, who were treated with degarelix (Firmagon) or leuprolide (Lupron depot) (Table 3). We will first identify all patients who initiated degarelix or leuprolide between 12/24/2008 and 6/30/2019. The start date was selected because degarelix received FDA approval on 12/24/2008. The date of an individual's first treatment (first fill date) with degarelix or leuprolide will be defined as the index date. We will then identify all male enrollees, without any age restrictions, with valid demographic (age and race/ethnicity) and residence data. All enrollees will be required to have at least 6 months of continuous enrollment with medical and pharmacy coverage (i.e., no gap in coverage of more than 45 days) before the index date, in order to capture an adequate prior medical history.
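The continuous-enrollment requirement above can be illustrated with a minimal sketch. The function name, the representation of coverage as (start, end) date pairs, the 183-day approximation of 6 months, and the treatment of the index date as part of the window are all assumptions made for illustration; they are not taken from the protocol or from OptumLabs tooling.

```python
from datetime import date, timedelta


def has_continuous_enrollment(spans, index_date, lookback_days=183, max_gap_days=45):
    """Check coverage over the lookback window ending on index_date.

    spans: list of (start, end) coverage dates, end inclusive (hypothetical layout).
    Returns True when no uncovered stretch in the window exceeds max_gap_days.
    """
    window_start = index_date - timedelta(days=lookback_days)
    # Keep spans that overlap the window, clip them to it, and sort by start date.
    clipped = sorted(
        (max(start, window_start), min(end, index_date))
        for start, end in spans
        if end >= window_start and start <= index_date
    )
    covered_until = window_start - timedelta(days=1)  # last covered day so far
    for start, end in clipped:
        gap = (start - covered_until).days - 1  # uncovered days before this span
        if gap > max_gap_days:
            return False
        covered_until = max(covered_until, end)
    # Uncovered days between the last covered day and the index date.
    return (index_date - covered_until).days <= max_gap_days
```

Under these assumptions, a patient whose coverage lapses for 30 days inside the window remains eligible, while a 61-day lapse leads to exclusion.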
We developed an algorithm to identify enrollees with prostate cancer based on clinical expertise and similar methodology outlined in previous studies, which reported positive predictive values between 70% and 82%. [15][16][17][18] Specifically, we will require patients to have at least one Evaluation and Management (E&M) visit with a diagnosis of prostate cancer within 6 months before the index date and at least one E&M visit with a diagnosis of prostate cancer any time after the index date. We are not able to ascertain prostate cancer severity, so no categorization by prostate cancer grade will be possible. For our primary analysis, a prostate biopsy will not be required, since patients may receive a diagnosis from a biopsy of a metastatic site (e.g., bone or lymph node). As a secondary analysis, we will restrict to a subcohort of patients with at least one prostate biopsy.
To identify patients with a history of cardiovascular disease, it is common to use both primary and secondary discharge diagnosis and procedure codes. This is an established method for administrative claims data research and is used extensively for cohort creation and quality measurement by the Centers for Medicare & Medicaid Services (CMS). [19] In particular, we will identify the following using discharge diagnosis and procedure codes as indicators of a history of cardiovascular disease, at least 30 days before the index date: myocardial infarction, percutaneous coronary intervention (PCI), coronary artery bypass grafting (CABG), peripheral artery revascularization, carotid revascularization any position.
Among patients with prostate cancer and pre-defined cardiovascular disease, we will use pharmacy claims data to exclude patients with a prescription fill of ADT medications within 6 months before the index date. However, for the leuprolide cohort, we will allow patients to remain eligible for inclusion even if they received bicalutamide within one month prior to leuprolide initiation. Leuprolide, a GnRH agonist, can paradoxically lead to a transient increase in testosterone during the first 1 to 3 weeks of treatment. Therefore, bicalutamide is often given for a few weeks before the initial leuprolide injection in order to block any potential adverse effects from the testosterone flare. Degarelix, a GnRH antagonist, does not produce a testosterone flare.
Lastly, we will exclude patients with recent/active cardiovascular events. We will use primary diagnosis of myocardial infarction and stroke (emergency department or inpatient visits) within 30 days before index date, as patients hospitalized for an acute cardiovascular event would be expected to have these diagnoses listed as the primary discharge diagnosis, while patients with a history of cardiovascular disease would be expected to have these diagnoses listed as secondary discharge diagnoses.

Baseline characteristics
We will record and summarize key baseline characteristics, including socio-demographic characteristics, comorbidities, and prior and concurrent medication use (Table 4).
Update August 2020: After accessing the data and running preliminary analyses, we updated our protocol to account for additional baseline comorbidities (italicized and underlined in Table 4). These comorbidities were selected to account for residual confounding by severity of disease. In particular, we observed that degarelix was paradoxically associated with increased mortality. To further account for observed imbalances between the degarelix and leuprolide patients, we also matched on state.
Medical history will be determined using patients' physician, facility, and pharmacy claims before or on the index date. We will use all data available to us to establish patients' medical history. [...] between lab testing facilities and the OptumLabs Data Warehouse. For the patients with laboratory data, we will determine serum prostate-specific antigen (PSA) levels and estimated glomerular filtration rate (eGFR). We will also determine whether patients had a prostate biopsy or received radiotherapy within 6 months before the index date. Previous treatment with bicalutamide and other baseline medications will be determined 6 months prior to index date.

State categories listed in Table 4 include IL, IN, MA, MN, MO, NC, NJ, NY, OH, RI, SC, TN, TX, UT, VA, WI, and other.
ACEi, angiotensin-converting-enzyme inhibitors; ARB, angiotensin II receptor blocker; CABG, coronary artery bypass grafting; COPD, chronic obstructive pulmonary disease; DOAC, direct-acting oral anticoagulants; IQR, interquartile range; MI, myocardial infarction; PAD, peripheral artery disease; PCI, percutaneous coronary intervention
a Underlined values were added after data were accessed

Follow-up and outcome ascertainment
OptumLabs Data Warehouse is continuously updated on a monthly basis and the data are complete within 6 months of the service being provided. The analyses of this study will be performed from May to September 2020, implying that the most recent data available to us will be up to October, 2019. Therefore, patients will be followed until the end of the study period (07/31/2019), the end of enrollment in health insurance plans, or death, whichever is first.

Study outcomes
We will use similar primary and secondary endpoints as the PRONOUNCE trial (Box 1).
The primary endpoint in the PRONOUNCE trial is the time to first occurrence of the composite Major Adverse Cardiovascular Event (MACE) endpoint, defined as death due to any cause, non-fatal myocardial infarction, or non-fatal stroke. The secondary endpoints in the PRONOUNCE trial include time from randomization to occurrence of fatal and non-fatal myocardial infarction, fatal and non-fatal stroke, fatal and non-fatal unstable angina requiring hospitalization, and cardiovascular-related death as separate outcomes. Using OptumLabs data, we are able to determine stroke, myocardial infarction, and angina, but are unable to distinguish between fatal and non-fatal events (Table 5). However, we will use commonly used, published, and previously validated diagnosis and procedure codes for MACE. For instance, previous evaluations suggest that the performance of similar MACE outcome codes is relatively good, with positive predictive values between 88.4% and 94% for myocardial infarction, 85% for ischemic stroke, and 80%-98% for hemorrhagic stroke. [21-25] [...] particularly since most deaths occur in an institutional setting. We acknowledge that a small proportion of patients who died out of hospital and were not captured by the Death Master File could be missing; however, this should be non-differential between treatment groups and should not influence our comparison.
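As a minimal sketch of how the time to first occurrence of the composite endpoint could be derived per patient, the function below takes the earliest qualifying component event or censors the patient. The function name, the dictionary layout, and the component labels are hypothetical; starting follow-up on the day after the index date follows the protocol text, and the censoring date is assumed to already be the earliest of end of study, end of enrollment, and death.

```python
from datetime import date, timedelta


def time_to_first_mace(index_date, component_dates, censor_date):
    """Return (days_at_risk, event_observed, first_component).

    component_dates: dict mapping a component label to its first claim date,
    or None when the component was never observed (hypothetical layout).
    """
    followup_start = index_date + timedelta(days=1)
    # Keep only component events that fall inside the follow-up window.
    observed = {
        name: when
        for name, when in component_dates.items()
        if when is not None and followup_start <= when <= censor_date
    }
    if observed:
        name, when = min(observed.items(), key=lambda item: item[1])
        return (when - followup_start).days, True, name
    # No component event: the patient is censored.
    return (censor_date - followup_start).days, False, None
```

The returned pair of duration and event flag is the usual input to a time-to-event model; the component label is kept only to support the secondary, component-level outcomes.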

Study follow-up
For each patient, we will also determine the follow-up time, which will start the day after initiation of degarelix or leuprolide. Follow-up will continue until the date when the patient experiences any of the following events:

Missing data
Patients will be considered to have a condition, comorbidity, outcome, or drug exposure if they have a corresponding claim, and will be considered not to have a comorbidity, outcome, or drug exposure if they do not have a corresponding claim. Although we will therefore not have missing comorbidities, drug use, or outcomes data, misclassification may exist. While this is a limitation of using claims data, the algorithms used to define our inclusion/exclusion criteria, outcomes of interest, and important covariates are commonly used and have demonstrated good performance in previous studies. We suspect that any existing misclassification will be unrelated to treatment group and should not meaningfully impact our findings.
We will exclude patients with invalid demographic data during the cohort creation process (e.g., missing residence region or inconsistent birth year). However, we anticipate that fewer than 1% of patients will be excluded during cohort creation. For race/ethnicity, the categories in the database are non-Hispanic white, non-Hispanic black, Hispanic, Asian, other, and unknown. Other and unknown will be used as a separate category in the propensity score model.

Main analysis using OptumLabs cohort
For our primary analyses, we will focus on OptumLabs patients who would be eligible for PRONOUNCE based on the operational definitions of the inclusion and exclusion criteria in Table 1 (base population).
Propensity score matching will be used to balance the difference in baseline characteristics between patients who received degarelix versus those who received leuprolide.
A propensity score, the probability of receiving degarelix, will be estimated using a logistic regression model which includes patient characteristics presented in Table 4. No interaction terms will be used. One-to-one nearest-neighbor caliper matching will be used to match patients based on the logit of the propensity score using a caliper equal to 0.2 of the standard deviation of the logit of the propensity score. [27] Standardized differences will be used to assess the balance of covariates after matching and a standardized difference within 0.1 will be considered acceptable. [28] Covariates with standardized differences above 0.1 will be adjusted for in the regression models.
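The matching and balance steps described above can be sketched as follows. This is an illustration only: it assumes propensity scores have already been estimated, uses a greedy match without replacement in the order patients are listed, and takes the standard deviation of the logit across both groups combined. The protocol does not state these details, and the analyses themselves are planned in SAS and Stata, so the function names and choices here are hypothetical.

```python
import math
import statistics


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def caliper_match(treated_ps, control_ps, caliper_sd=0.2):
    """Greedy 1:1 nearest-neighbor matching on the logit of the propensity score.

    Returns a list of (treated_index, control_index) pairs. Controls are used
    at most once, and pairs further apart than the caliper are not formed.
    """
    t_logit = [logit(p) for p in treated_ps]
    c_logit = [logit(p) for p in control_ps]
    caliper = caliper_sd * statistics.stdev(t_logit + c_logit)
    available = set(range(len(c_logit)))
    pairs = []
    for i, value in enumerate(t_logit):
        if not available:
            break
        j = min(available, key=lambda k: abs(c_logit[k] - value))
        if abs(c_logit[j] - value) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


def standardized_difference(x_treated, x_control):
    """Difference in means divided by the pooled standard deviation."""
    pooled_sd = math.sqrt(
        (statistics.variance(x_treated) + statistics.variance(x_control)) / 2.0
    )
    return (statistics.mean(x_treated) - statistics.mean(x_control)) / pooled_sd
```

After matching, `standardized_difference` would be computed for each covariate in the matched sample, with absolute values above 0.1 flagging covariates to adjust for in the outcome model.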
Cox proportional hazards regression will be used to compare patients receiving degarelix versus those who received leuprolide for the primary and secondary outcomes in the propensity matched cohort, with robust sandwich estimates to account for the clustering within matched sets. [29] The proportional hazard assumption will be tested on the basis of Schoenfeld residuals. [30] If the proportional hazard assumption is not met, we will assess alternative time to event models, including parametric models, using Akaike information criterion and Bayesian information criterion to determine the final model specification. The Fine and Gray method will be used to consider death as a competing risk when assessing non-fatal outcomes. [31] All primary analyses will compare the assigned treatment groups under the intention-to-treat principle.
All analyses will be conducted using SAS 9.4 (SAS Institute Inc., Cary, North Carolina) and Stata 16 (Stata Corp, College Station, TX).

Subgroup Analyses
First, we will repeat our analyses restricted to a subcohort of patients with at least one prostate biopsy. Next, we will perform subgroup analyses for the primary outcome stratified by age, race, diabetes mellitus, and renal function, using receipt of hemodialysis to identify patients with end-stage renal disease. In addition, for the patients with laboratory data, we will generate subgroups of patients with eGFR <45 and >45. The subgroup analyses will be performed separately in patients who were eligible for the trial (primary analysis) and patients who failed to meet the inclusion/exclusion criteria (secondary analyses). Within each subgroup, we will re-examine the standardized differences to assess the balance of the covariates. If the majority of the standardized differences are above 0.1, the threshold used for the main analysis, we will rematch the patients within each subgroup. Because an increasing number of subgroup analyses could increase the chance of false-positive results, we pre-specified the above subgroups, which are either key demographic characteristics or risk factors strongly associated with the primary outcome. However, we will not perform any adjustment for multiple testing.

Sensitivity Analyses
We will conduct the following sensitivity analyses to assess the robustness of the findings:
1. We will repeat our analyses across two subgroups: (1) patients who failed to meet any one of the cardiovascular inclusion criteria for PRONOUNCE; and (2) [...]
3. We will conduct a stratified analysis based on the adherence to degarelix and leuprolide, i.e., patients with proportion of days covered (PDC)<80% and those with PDC≥80%, since the adherence to medical therapy in practice is often lower than that in clinical trials. The adherence will consider all drugs that a patient used during follow-up, even if they were different from the initial treatment. For degarelix, which is administered monthly, we will use a 30-day supply. For leuprolide, there are multiple dosing intervals, which makes the timing of the next dose dependent on the dose given at the last injection. For leuprolide patients without dose information, we will assume that fills were for 30 days. After matching the degarelix patients with PDC≥80% with the leuprolide patients with PDC≥80%, we will conduct the Cox regression analyses to compare the outcomes between the two groups. Analyses will be repeated among patients with PDC<80%.
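A minimal sketch of the proportion of days covered (PDC) calculation follows. It assumes fills are available as (fill date, days supply) pairs, that a missing days supply defaults to 30 days as stated above, and that overlapping supply is counted once rather than shifted forward. The function name and data layout are hypothetical, and the overlap rule is an assumption that the protocol does not specify.

```python
from datetime import date, timedelta


def proportion_of_days_covered(fills, followup_start, followup_end, default_days_supply=30):
    """Share of follow-up days on which the patient had drug supply on hand.

    fills: list of (fill_date, days_supply) pairs; days_supply may be None,
    in which case default_days_supply is assumed.
    """
    covered = set()
    for fill_date, days_supply in fills:
        supply = days_supply if days_supply is not None else default_days_supply
        for offset in range(supply):
            day = fill_date + timedelta(days=offset)
            # Count only days inside the follow-up window; overlaps count once.
            if followup_start <= day <= followup_end:
                covered.add(day)
    total_days = (followup_end - followup_start).days + 1
    return len(covered) / total_days
```

Patients would then be assigned to the PDC≥80% or PDC<80% stratum by comparing the returned value with 0.8.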
4. We will assess falsification endpoints to test for residual confounding. Treatment effects estimated in observational studies are prone to unmeasured confounding. In recent years, a falsification endpoint, also called a control outcome, has become a popular method to assess for unmeasured confounding. [32-34] A falsification endpoint is a health outcome that researchers believe is highly unlikely to be causally related to the treatment in question. If a significant relationship is found between the treatment and a falsification endpoint, it may indicate that the treatment groups are different in some unmeasured ways, i.e., the existence of unmeasured confounding. This method is similar to a negative control, a routine precaution taken in the design of biologic laboratory experiments, and is recommended for detecting confounding and bias in observational studies. [33,35,36] This method is particularly useful in observational studies comparing different treatment options, because the unmeasured confounding in these studies tends to make one group systematically healthier or less susceptible to adverse outcomes than the other group.
We selected two endpoints that are unlikely to be associated with use of either degarelix or leuprolide: chronic obstructive pulmonary disease and appendicitis/cholecystitis. If a significant relationship were to be found between degarelix and either of these endpoints, it would indicate the existence of residual confounding.

Comparison of cohort and trial population characteristics and results
Once the ongoing trials have been completed, we will compare the trial population to the population of patients identified in the claims component of OptumLabs data after application of the pre-specified eligibility criteria, as described above, to determine how accurately the characteristics of trial populations can be predicted. For each individual characteristic included in Table 4 and reported in the PRONOUNCE trial publication, we will make pairwise comparisons between the trial population and the population of patients identified in the claims component of OptumLabs, stratified by treatment arm. In particular, within the degarelix and leuprolide arms, we will take the paired differences between the standardized mean differences (Cohen's d) from the trial population and the real-world population. Differences between standardized mean differences within 0.2 will be considered acceptable.
We will also compare rates of missing data and loss to follow-up across arms. If we observe significant differences in the characteristics of the real-world and trial populations, we will estimate a real-world population reweighted to mirror the characteristics of the PRONOUNCE trial population.
Once the PRONOUNCE trial has been completed and published, we will compare the final primary and secondary endpoint results to the results estimated from A) the population of real-world patients meeting pre-specified trial eligibility criteria and (when needed as explained above) B) the population of real-world patients reweighted to mirror the characteristics of the final enrolled population of patients in the trial, for each analytic approach employed as described above. As needed, we will also compare each of the observational approaches used above to the RCT results, providing a better understanding of the tradeoffs inherent to each of the proposed methods.
Two approaches for comparing results will be used. First, as a simple method, results from both the real-world data and the trial will be characterized as positive (i.e., degarelix statistically significantly reduces the risk of cardiovascular complications as compared to leuprolide), neutral (no statistically significant difference between degarelix and leuprolide), or negative (degarelix statistically significantly increases the risk of cardiovascular complications as compared to leuprolide) and a percent agreement will be estimated. Statistical tests will be 2-sided and significance will be set at P < 0.05.
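The simple agreement method can be sketched as below. The sketch assumes that statistical significance is read from whether the 95% confidence interval of the hazard ratio (degarelix vs leuprolide) excludes 1.0, which matches a 2-sided Wald test at P < 0.05. The function names are hypothetical, and the confidence limits in the usage lines are made-up illustrations, not study results.

```python
def classify_result(hr_lower, hr_upper):
    """Label a hazard ratio from its 95% confidence limits."""
    if hr_upper < 1.0:
        return "positive"  # degarelix significantly reduces risk
    if hr_lower > 1.0:
        return "negative"  # degarelix significantly increases risk
    return "neutral"       # interval includes 1.0


def percent_agreement(real_world_labels, trial_labels):
    """Percentage of outcomes given the same label in both sources."""
    if len(real_world_labels) != len(trial_labels) or not trial_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(real_world_labels, trial_labels))
    return 100.0 * matches / len(trial_labels)


# Illustrative values only: one label per endpoint, in the same order.
real_world = [classify_result(0.85, 1.20), classify_result(0.60, 0.90)]
trial = [classify_result(0.70, 1.10), classify_result(0.90, 1.40)]
agreement = percent_agreement(real_world, trial)
```

In this made-up example the first endpoint is neutral in both sources and the second differs, so the agreement is 50%.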
Second, we will pursue a more sophisticated method. The hazard ratios calculated for the primary and secondary outcomes using the real-world data will be converted to natural logarithm hazard ratios (lnHR). For each outcome, we will then take the difference between the lnHR calculated using the real-world data and the lnHR reported by the PRONOUNCE trial.
After exponentiating each difference, a ratio of hazard ratios greater than 1.0 will imply greater (more beneficial) treatment effects in the real-world population than in the PRONOUNCE trial population. We will calculate 95% confidence intervals for the ratios of the hazard ratios using a standard error equal to the square root of the sum of the variance of the lnHR derived from the real-world data and the variance of the lnHR from the PRONOUNCE trial population. Our variance calculations will be based on the assumption of independence (i.e., correlation coefficients of zero, indicating that the outcomes from the real-world data and trial data are independent).

ACEi, angiotensin-converting-enzyme inhibitor; ARB, angiotensin II receptor blocker; CABG, coronary artery bypass grafting; COPD, chronic obstructive pulmonary disease; DOAC, direct-acting oral anticoagulant; IQR, interquartile range; MI, myocardial infarction; PAD, peripheral artery disease; PCI, percutaneous coronary intervention; PSA, prostate-specific antigen
a Patients with PSA and patients without PSA were matched separately
b eGFR was used for the subgroup analyses and not for matching purposes
* Cell suppression based on OptumLabs Cell Size Suppression rules. N<11 are masked to protect patient confidentiality
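The ratio of hazard ratios described in the second comparison method can be sketched as follows, working on the log scale and assuming independence between the real-world and trial estimates. The function names are hypothetical. Recovering a standard error from a published 95% confidence interval is an added convenience that assumes the interval is symmetric on the log scale; the protocol does not describe that step.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a 2-sided 95% interval


def se_from_ci(lower, upper, z=Z_95):
    """Standard error of lnHR recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def ratio_of_hazard_ratios(hr_real_world, se_real_world, hr_trial, se_trial, z=Z_95):
    """Return (ratio, lower, upper) for HR_real_world / HR_trial.

    se_real_world and se_trial are standard errors of the lnHR estimates;
    the two estimates are treated as independent (zero correlation).
    """
    diff = math.log(hr_real_world) - math.log(hr_trial)
    se_diff = math.sqrt(se_real_world ** 2 + se_trial ** 2)
    return (
        math.exp(diff),
        math.exp(diff - z * se_diff),
        math.exp(diff + z * se_diff),
    )
```

A confidence interval for the ratio that includes 1.0 would indicate no statistically detectable difference between the real-world and trial treatment effects under the independence assumption.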